Abstract

We have calculated the probability that the clustering of arrival directions of
ultra-high energy cosmic rays (UHECRs) is consistent with a finite number of
uniformly distributed proton sources, assuming small deflections by magnetic
fields outside the Galaxy. A continuous source distribution is mimicked only by
an unrealistically high source density, $n_s \gg 10^{-2}/{\rm Mpc}^3$. Even for
densities as large as $n_s = 10^{-3}/{\rm Mpc}^3$, less than half of the
observed clusters are on average by chance. For the best-fit value
$n_s = (1-4) \times 10^{-5}/{\rm Mpc}^3$ derived from the AGASA data, the
probability that at least one observed cluster is from a true point source is
larger than 99.97%, while on average almost all observed clusters are true. The
best-fit value found is comparable to the density of AGNs and consistent with
the recent HiRes stereo data. In this scenario, the Pierre Auger Observatory
will not only establish the clustering of UHECRs but also determine the density
of UHECR sources within a factor of a few after one year of data taking.

PACS: 98.70.Sa 



1 Introduction 



The acceleration of protons or heavy nuclei to energies $E > 10^{19}$ eV is
difficult for all known astrophysical sources of cosmic rays [1]. Therefore, one
expects that only a small fraction of all cosmic ray (CR) sources is able to
accelerate particles beyond $10^{19}$ eV. The signature of a small number of
ultra-high energy (UHE) CR sources is the small-scale clustering of their
arrival directions, if the deflection of CRs in magnetic fields can be
neglected. The structure and magnitude of the Galactic magnetic field can be
estimated by observing the Faraday rotation of the polarized radio emission of
pulsars. At energies $E > 4 \times 10^{19}$ eV, the deflection of CR protons in
this field is less than 4-6 degrees in most directions and decreases to 1-2
degrees at $E > 10^{20}$ eV [2]. The overall effect on potential CR clusters is
even smaller when the energies of the CRs in the cluster are not too different.

The magnitude and structure of extragalactic magnetic fields is more uncer- 
tain. Only recently, magnetic fields were included in simulations of large scale 
structures [3,4]. It was found that extragalactic magnetic fields are strongly 
localized in galaxy clusters and filaments, while voids contain only primordial 
fields. The latter cannot be stronger than $B \sim 10^{-11}$ G, otherwise the
observed field strengths in galaxy clusters would be exceeded. Even if sources
tend to sit in regions of high density and thus strong magnetic fields, CRs can 
be significantly deflected only within clusters. But the angular size of distant 
galaxy clusters is much less than one degree and thus they appear as point-like 
sources unless a nearby cluster is on the line of sight. Thus in the part of the 
sky outside of nearby galaxy clusters astronomy with UHE protons may be 
possible, unless the observer is embedded in a strongly magnetized region. In 
the latter case, deflections are important even for UHE protons, and charged 
particle astronomy may not be possible [4]. A crucial step towards the goal of
UHE proton astronomy is the identification of point sources of UHECRs. 

The AGASA data on the arrival directions of CRs with energies
$E > 4 \times 10^{19}$ eV contain a clustered component with five pairs and one
triplet within 2.5 degrees [5,6]. Neglecting possible systematic errors in
energy scales, the sensitivity of the other experiments to clustering at
energies $E > 4 \times 10^{19}$ eV is much smaller, either because of the
smaller exposure at the highest energies (Yakutsk, HiRes in stereo mode) or
because of a poor two-dimensional angular resolution (HiRes in monocular mode).
At lower energies, $E < 4 \times 10^{19}$ eV, where deflections by magnetic
fields become more important, a clustered component still exists in the AGASA,
Yakutsk and HiRes stereo data, but with reduced significance.

The small-scale clustering of UHECR arrival directions has been discussed by
various authors. These works can be divided into two main groups: the first one
calculates the significance of the small-scale clusters [7,8,9,10,11], while the
second group of works uses the data to estimate parameters like the density
$n_s$ of sources or the strength of magnetic fields [12,13,14,15,16,17]. The
authors of Ref. [12] pointed out, to our knowledge for the first time, that the
observation of small-scale clusters allows one to determine the number density
of CR sources. In practice, the observed small-scale clusters of AGASA were
first used to estimate the number of CR sources in the pioneering work of
Dubovsky, Tinyakov and Tkachev [13]. Previous analyses of the significance of
the small-scale clusters observed by AGASA used a continuous distribution of
sources as a test hypothesis. Such a distribution has the advantage of being
model-independent and gives a lower limit on the significance of clustering for
a finite number of sources, as long as deflections by magnetic fields can be
neglected. Here, we investigate the significance of the small-scale clusters
within a realistic model of UHECR protons propagating from astrophysical sources
distributed uniformly in the Universe. In particular, we calculate the number of
true clusters, i.e. those with CRs from the same source, as a function of $n_s$.
We show that the asymptotic limit of a continuous distribution of sources is
reached only for an unrealistically high density of sources,
$n_s \gg 10^{-2}/{\rm Mpc}^3$, where the latter value corresponds to one source
per galaxy. We also estimate the number density of CR sources assuming small
deflections of CRs in Galactic and extragalactic magnetic fields. Our Monte
Carlo procedure is very similar to the one of Ref. [17]. However, we derive
confidence levels for the consistency of arbitrary source densities with the
clustering observed by the AGASA experiment, while Ref. [17] considered only
three exemplary values of the source density. Moreover, our analysis shows
strong deviations from Gaussianity for the probability distribution of the
autocorrelation function. For the best-fit value
$n_s = (1-4) \times 10^{-5}/{\rm Mpc}^3$ derived from the AGASA data, the
probability that at least one observed cluster is from a true point source is
larger than 99.97%. For such densities, we predict that the Pierre Auger
Observatory (PAO) [18] will be able to determine $n_s$ within a factor of ten at
$2\sigma$ C.L. after one year of data taking. At the same time, the PAO will
establish that clustering is not by chance at the $5\sigma$ level for any
estimated source density smaller than $n_s = 10^{-4}/{\rm Mpc}^3$.



2 Analysis of the AGASA and HiRes data

The authors of Ref. [19] were the first to use the angular two-point
auto-correlation function $w$ to discuss the significance of small-scale
clustering in the arrival directions of UHECRs. Since the signal of point-like
sources should be concentrated around $\ell_{ij} = 0$, we restrict our analysis
to the value of $w$ in the first bin,

$$ w_1 = \sum_{i=1}^{N} \sum_{j=1}^{i-1} \Theta(\ell_1 - \ell_{ij}), $$

where $\ell_{ij}$ is the angular distance between the two cosmic rays $i$ and
$j$, $\ell_1$ the chosen bin size, $\Theta$ the step function, and $N$ the
number of CRs considered.
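
As an illustration of the first-bin statistic, the following minimal sketch
counts all pairs of events closer than the bin size. The function names and the
toy event list are hypothetical, events are assumed to be given as (R.A.,
declination) in degrees, and pairs exactly at the bin edge are counted, which
is a convention chosen here rather than one stated in the text.

```python
import math

def angular_distance_deg(ra1, dec1, ra2, dec2):
    """Great-circle distance in degrees between two directions (inputs in degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    cos_l = (math.sin(dec1) * math.sin(dec2)
             + math.cos(dec1) * math.cos(dec2) * math.cos(ra1 - ra2))
    # Clamp against rounding errors before taking the arccosine.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_l))))

def first_bin_autocorrelation(events, bin_size_deg):
    """Count pairs (i, j < i) whose angular separation is within bin_size_deg."""
    w1 = 0
    for i in range(len(events)):
        for j in range(i):
            if angular_distance_deg(*events[i], *events[j]) <= bin_size_deg:
                w1 += 1
    return w1

# Toy data (hypothetical): three nearby events and one isolated event.
events = [(10.0, 20.0), (11.0, 20.0), (10.5, 21.0), (200.0, -30.0)]
print(first_bin_autocorrelation(events, 2.5))
```

Note that a triplet contributes three pairs to $w_1$, so five pairs and one
triplet correspond to $w_1 = 8$.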

Fig. 1. Consistency level $P$ of a uniform source distribution with the AGASA
data as function of the density $n_s$, with $N = 57$ (solid line) and $N = 72$
(dotted line) events for $\ell_1 = 2.5^\circ$, and with $N = 57$ (dashed line)
for $\ell_1 = 5^\circ$. Lower limits on $n_s$ from the HiRes stereo data
(arrows) and the density range of X-ray selected AGNs with X-ray luminosity
$L > 10^{43}$ erg/s are also shown.

A drawback of using only the first bin of the autocorrelation function $w$ is
the dependence of the results on $\ell_1$. As a possible solution, one can
perform a scan over different bin sizes and calculate the resulting penalty
factor [20]. However, the result then still depends on the minimal and maximal
bin size used in the scan: choosing the scan range too large reduces the
signal-to-noise ratio and thus diminishes the signal, while a too small range
overestimates the signal. Following a different approach, we generated
artificial data sets from a single point source, deflected them in the magnetic
field, and finally smeared their arrival directions according to the angular
resolution of the experiment. Then we chose the best bin size $\ell_1$ such
that the probability to observe an experimental value $\tilde{w}_1$ by chance is
minimized as function of $\ell_1$. Here, $\tilde{w}$ is the normalized
auto-correlation function,

$$ \tilde{w}_1 = \frac{2\,\Omega_{\rm exp}}{N(N-1)\,\Omega_{\rm bin}}\, w_1 , $$



where $\Omega_{\rm exp}$ and $\Omega_{\rm bin}$ denote the solid angle with
non-zero exposure of the experiment and of the bin considered, respectively.
Without the effect of the Galactic magnetic field, the optimal value found,
e.g., for the angular resolution of the AGASA experiment is
$\ell_1 \sim 2^\circ$; including the effect of the Galactic magnetic field, we
found $\ell_1 \sim 2$-$4^\circ$ as the optimal range. Similarly as for
$\ell_1$, we could try to find the optimal minimal energy $E_{\min}$ of the
events taken into account. Earlier analyses found a penalty factor of only
three for the scan over $E_{\min}$ in the AGASA data [19,20].

We generate sources with constant comoving density $n_s$ up to the maximal
redshift $z = 0.2$; we have checked that the flux of sources further away is
negligible above $4 \times 10^{19}$ eV. Then we choose a source $i$ with
equatorial coordinates R.A. and $\delta$ at comoving radial distance $R_i$
according to the declination-dependent exposure of the experiment and the weight

Here, $R_{\min}$ and $z_{\min}$ are the distance and redshift of the nearest
source in the sample, respectively, and we have assumed the same luminosity for
all sources. Then CRs are generated according to the injection spectrum
$dN/dE \propto E^{-\alpha}$, where we fix $\alpha = 2.7$ to reproduce best the
AGASA energy spectrum below the GZK cutoff. We propagate CRs until their energy
is below $E_{\min}$ or they reach the Earth. In the latter case, we take into
account the energy-dependent angular deflection through the Galactic magnetic
field and the angular resolution of the experiment. For the angular resolution,
we use a spherical Gaussian density
$\propto \exp(-\ell^2/(2\sigma_\ell^2)) \sin(\ell)\,d\ell$ with
$\sigma_\ell/{\rm degree} = \max(0.8, -0.6 \log(E/{\rm eV}) + 13)$ for AGASA and
the PAO, and $\sigma_\ell/{\rm degree} = 0.4$ for HiRes, respectively, and for
the Galactic magnetic field we use a shift of
$1.6 \times 10^{20}\,{\rm eV}/(E \cos({\rm R.A.}))$ degrees.
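
The angular-resolution smearing described above can be sketched as follows.
This is only an illustration under stated assumptions: the logarithm in the
resolution formula is taken to be base 10 (a natural logarithm would always
give the floor value of 0.8 degrees), the function names are hypothetical, and
the rejection sampling with a Rayleigh proposal is a technique chosen here, not
necessarily the one used in the original simulation.

```python
import math
import random

def sigma_deg(energy_ev, experiment="AGASA"):
    """Resolution width sigma_l in degrees (log assumed to be base 10)."""
    if experiment == "HiRes":
        return 0.4
    # Same parametrisation for AGASA and the PAO, with a floor of 0.8 degrees.
    return max(0.8, -0.6 * math.log10(energy_ev) + 13.0)

def sample_offset_deg(sigma, rng):
    """Draw l with density ~ exp(-l^2 / (2 sigma^2)) sin(l) on (0, 180] degrees."""
    sigma_rad = math.radians(sigma)
    while True:
        # Rayleigh proposal: density ~ l exp(-l^2 / (2 sigma^2)).
        u = 1.0 - rng.random()  # uniform in (0, 1]
        l = sigma_rad * math.sqrt(-2.0 * math.log(u))
        if l == 0.0 or l > math.pi:
            continue
        # Accept with probability sin(l) / l <= 1 to obtain the sin(l) weight.
        if rng.random() < math.sin(l) / l:
            return math.degrees(l)

rng = random.Random(1)
energy = 4e19  # eV
offsets = [sample_offset_deg(sigma_deg(energy), rng) for _ in range(5)]
print(sigma_deg(energy), offsets)
```

For widths of about one degree the acceptance probability is very close to
one, so the sampler is essentially a Rayleigh draw; the rejection step only
matters for large offsets.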

The basic outcome of a sample of Monte Carlo simulations for fixed parameters
$\vartheta = n_s, \ell_1, \ldots$, is a binned distribution $p(w_1|\vartheta)$
for the values $w_1$ of the auto-correlation function. With how much confidence
can we accept or reject the hypothesis that the experimentally measured value
$w_1^*$ is drawn from $p(w_1|\vartheta)$? In previous analyses, the test
hypothesis was a continuous, isotropic distribution of sources on a sphere
$S^2$, for which one expects lower values of $w_1$ than measured. Therefore, the
probability that $w_1^*$ is consistent with $p(w_1|\vartheta)$ was calculated as
$P_>(w_1^*, S^2) = \sum_i p_i(w_1|S^2)\,\Theta(w_i - w_1^*)$. This asymmetric
definition fails when one wants to reject both cases with too much and with too
little clustering. We shall use as a more symmetric measure for the discrepancy
between $w_1^*$ and $p(w_1|\vartheta)$ the area between the measured value
$w_1^*$ and the median $w_{1/2}$ of the distribution.
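
To make the two measures concrete, the sketch below evaluates the one-sided
tail probability and the probability mass between the median and the measured
value for a binned distribution. It is a hedged illustration only: the list `p`
(indexed by $w_1$) and all function names are hypothetical, and the treatment
of the bin edges is one possible convention; how the area is finally converted
into a consistency level is not specified here.

```python
def tail_probability(p, w_obs):
    """One-sided measure: probability of drawing w1 >= w_obs from p."""
    return sum(p[w_obs:])

def median_bin(p):
    """Smallest w1 at which the cumulative distribution reaches 1/2."""
    cumulative = 0.0
    for w, prob in enumerate(p):
        cumulative += prob
        if cumulative >= 0.5:
            return w
    return len(p) - 1

def area_to_median(p, w_obs):
    """Probability mass between the median bin and w_obs (assumed convention)."""
    lo, hi = sorted((median_bin(p), w_obs))
    return sum(p[lo + 1:hi + 1])

# Toy binned distribution (hypothetical numbers), p[w1] for w1 = 0..4.
p = [0.1, 0.2, 0.4, 0.2, 0.1]
print(tail_probability(p, 3), median_bin(p), area_to_median(p, 4))
```

The area vanishes when the measured value sits at the median and grows towards
1/2 for values far in either tail, which is what makes the measure usable for
rejecting both too much and too little clustering.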

In Fig. 1, we show $P(w_1^*, \vartheta)$ as function of $n_s$ for three
different cases: the publicly available AGASA data set until May 2000 [21]
($N = 57$ or $E_{\min} = 4 \times 10^{19}$ eV) for the two bin sizes
$\ell_1 = 2.5^\circ$ ($w_1^* = 7$) and $\ell_1 = 5^\circ$ ($w_1^* = 10$), and
the complete AGASA data set [22] ($N = 72$ or $E_{\min} = 4 \times 10^{19}$ eV,
bin size $\ell_1 = 2.5^\circ$ and $w_1^* = 8$). Remarkably, the most likely
value for the source density, $n_s = (1-4) \times 10^{-5}/{\rm Mpc}^3$, is
stable against an increase of the data set and a change in bin size. A similar
value for $n_s$ was found previously by the authors of Ref. [17], while earlier
analyses [13,14] using only events above $E = 10^{20}$ eV obtained larger values
for $n_s$. The steep decrease of $P(w_1^*, \vartheta)$ for low $n_s$ already
excludes uniformly distributed sources with densities much lower than
$10^{-6}/{\rm Mpc}^3$. For comparison, we also show the estimated density of
powerful AGNs with X-ray luminosity $L > 10^{43}$ erg/s in the energy range
0.2-5 keV, $n_s \sim (1-5) \times 10^{-5}/{\rm Mpc}^3$ [23]. The density of
Seyfert galaxies is about a factor of 20 higher. Note that most often only very
specific subsets of AGNs with much lower densities are discussed as sources of
UHECRs. On the other hand, $P(w_1^*, \vartheta)$ decreases only slowly for large
$n_s$. With the present AGASA data set it is therefore difficult to exclude
large source densities.

Recently, the HiRes collaboration published an analysis of their stereo
data [24]. Their data set with $N = 27$ events above $4 \times 10^{19}$ eV
contains no doublet within $\ell_1 = 2.5^\circ$ and $5^\circ$ [25]. Therefore,
the HiRes data alone are consistent with a continuous source distribution. But
since the number of events is small and $p(w_1|\vartheta)$ is a broad
distribution, the HiRes data are also consistent with the best-fit value for
$n_s$ from the AGASA data, at 53% and 21% C.L. for $\ell_1 = 2.5^\circ$ and
$5^\circ$, respectively. In Fig. 1, we show also lower limits on $n_s$ for
$\ell_1 = 5^\circ$ from the HiRes stereo data. Similar conclusions were recently
obtained in Ref. [11]. The HiRes data favor a larger value of $n_s$ than AGASA
and may indicate that practically all Seyfert galaxies contribute to the CR flux
above $4 \times 10^{19}$ eV.

The effect of extragalactic magnetic fields on the above results is negligible
if the deflection is about 2° over a propagation distance of 500 Mpc, as found
for a large part of the sky in Ref. [3]. The assumption of equal luminosity of
all sources gives a lower bound on the possible number of sources [13]. A large
additional population of faint sources cannot be excluded, if their contribution
to the UHECR flux is sufficiently small. However, it is unlikely that any large
population of sources can accelerate CRs to energies $> 10^{19}$ eV.

Apart from the auto-correlation function $w_1$ of the observed arrival
directions of CRs, i.e. including deflections and the finite experimental
resolution, we can calculate also the auto-correlation function of the sources,
$W = \sum_{i=1}^{N} \sum_{j=1}^{i-1} \delta_{ij}$, with $\delta_{ij} = 1$ when
the two CRs are from the same source and $\delta_{ij} = 0$ otherwise. Using only
simulations which reproduce the observed value $w_1^*$ defines
$p(W|w_1^*, n_s)$. In Fig. 2, we show the probability $P_{\rm true}$ to have a
value of the auto-correlation function smaller than or equal to $W$,
$P_{\rm true}(W) = \sum_{W' \le W} p(W'|w_1^*, n_s)$, as function of $n_s$ for
$N = 57$ events, $w_1^* = 7$ and $\ell_1 = 2.5^\circ$. Since the difference
between the emitted and the observed direction of the CRs can be larger than
$\ell_1$, the values of $W$ can exceed $w_1^*$ for finite $n_s$.
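
In a simulation the source of every event is known, so $W$ reduces to counting
pairs of events that share a source. A minimal sketch, assuming each simulated
event carries a hypothetical integer source label:

```python
from collections import Counter

def source_autocorrelation(source_ids):
    """Number of event pairs that originate from the same source."""
    multiplicities = Counter(source_ids).values()
    # A source contributing m events yields m (m - 1) / 2 pairs.
    return sum(m * (m - 1) // 2 for m in multiplicities)

# Toy labels (hypothetical): one doublet, one singlet and one triplet.
print(source_autocorrelation([1, 1, 2, 3, 3, 3]))
# Single-source limit for N = 57 events: N (N - 1) / 2 pairs.
print(source_autocorrelation([0] * 57))
```

Grouping by source avoids the explicit double sum over event pairs and gives
the same result, since only pairs with $\delta_{ij} = 1$ contribute.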

Fig. 2. Probability $P_{\rm true}$ to have a smaller or equal value of the
source auto-correlation function $W$ as function of the source density $n_s$ for
$\ell_1 = 2.5^\circ$, 57 events and $w_1^* = 7$. The line $W = 0$ corresponds to
the case that all observed clusters are by chance. The density range of X-ray
selected AGNs with X-ray luminosity $L > 10^{43}$ erg/s is also shown.

The asymptotic behavior of $P_{\rm true}(W)$ is easily understandable: for a
single source, i.e. $n_s \to 0$, $P_{\rm true}(W) \to 0$ and $W = N(N-1)/2$ for
$N$ observed events. On the other hand, for $n_s \to \infty$ all clusters are by
chance and thus $P_{\rm true}(W) \to 1$. A priori, it is unclear whether for
source densities typical for, e.g., AGNs the distribution $P_{\rm true}$ is
still close to its limiting value for $n_s \to \infty$ or has already strongly
changed. We find that the limit $P_{\rm true}(W) \to 1$ is approached only for
unrealistically high source densities, $n_s \gg 10^{-2}/{\rm Mpc}^3$, where the
latter value corresponds to the density of ordinary galaxies. For smaller $n_s$,
the probability that at least one cluster observed by AGASA is real increases
very fast and reaches 99.97% at $n_s = 2 \times 10^{-5}/{\rm Mpc}^3$. But even
for densities as large as $n_s = 10^{-3}/{\rm Mpc}^3$, less than half of the
clusters are on average by chance.



3 Prediction for the PAO 

The value of the auto-correlation function $w_1$ is dominated by clusters with
high multiplicities and thus suffers from large cosmic variance. Moreover, the
overlap of the distributions $p(w_1|\vartheta)$ for different $n_s$ will be
reduced only very slowly by collecting more data. We have found that the
fraction of singlets is a more stable quantity against cosmic variance: singlet
events can come from larger distances than multiplets and are thus less affected
by variations of the source distributions. Therefore we propose to use the
distribution $p(N_1|\vartheta)$ of the number of singlet events instead of
$p(w_1|\vartheta)$ to estimate $n_s$ from the PAO data.

Fig. 3. Predicted distributions $p(N_1|\vartheta)$ for source densities
$n_s = 10^{-6}$, $10^{-5}$, $10^{-4}/{\rm Mpc}^3$ and for a continuous source
distribution after one year of data taking of the PAO ($N = 300$).

In Fig. 3, we show the distributions $p(N_1|\vartheta)$ for $N = 300$ events
expected in one year of running of the PAO for $n_s = 10^{-6}$, $10^{-5}$ and
$10^{-4}/{\rm Mpc}^3$, together with the one for the limit $n_s \to \infty$. The
distributions $p(N_1|\vartheta)$ are characterized by power-law tails towards
small $N_1$, and these tails seem to prevent a clear separation of different
densities. However, a very small fraction of singlet events is caused in most
cases by a single nearby source producing clusters with very high multiplicity.
We eliminate these exceptional cases by considering only samples where the
highest multiplet is at most a 50-plet for $N = 300$. This would correspond to
10 events from a single source in the case of AGASA. If an exceptionally bright
source were indeed found by the PAO, the region around this source should be
excluded from the analysis. As an example, we show in Fig. 4 for
$n_s = 2 \times 10^{-5}/{\rm Mpc}^3$ the probability with which the PAO
estimates the value of $n_s$. The elimination of clusters with very high
multiplicity reduces the cosmic variance and thereby increases the precision of
the estimate for $n_s$. The influence of extragalactic magnetic fields on
$p(N_1|\vartheta)$ is clearly negligible if the average deflection,
$\approx 2^\circ$ over 500 Mpc, is as small as found in Ref. [3]. If the average
deflection is closer to the values found in Ref. [4], the average number of
singlet events for a fixed number of sources would increase and become more and
more indistinguishable from the case of an infinite number of sources.
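
The singlet statistic $N_1$ can be counted directly from the arrival
directions. The sketch below is an illustration under an assumption made here:
a singlet is taken to be an event with no other event within the bin size
$\ell_1$. The function names and the toy event list are hypothetical, and
directions are (R.A., declination) in degrees.

```python
import math

def unit_vector(ra_deg, dec_deg):
    """Cartesian unit vector for equatorial coordinates given in degrees."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra), math.cos(dec) * math.sin(ra), math.sin(dec))

def count_singlets(events, bin_size_deg):
    """Number of events without any neighbour within bin_size_deg (assumed definition)."""
    vectors = [unit_vector(ra, dec) for ra, dec in events]
    cos_min = math.cos(math.radians(bin_size_deg))
    has_neighbour = [False] * len(vectors)
    for i in range(len(vectors)):
        for j in range(i):
            dot = sum(a * b for a, b in zip(vectors[i], vectors[j]))
            # A larger dot product means a smaller angular separation.
            if dot >= cos_min:
                has_neighbour[i] = has_neighbour[j] = True
    return has_neighbour.count(False)

# Toy data (hypothetical): one triplet and one isolated event.
events = [(10.0, 20.0), (11.0, 20.0), (10.5, 21.0), (200.0, -30.0)]
print(count_singlets(events, 2.5))
```

Repeating this count over many simulated data sets for fixed parameters would
give a binned estimate of $p(N_1|\vartheta)$.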

Fig. 4. Confidence level for the estimate of the source density $n_s$ from the
distribution $p(N_1|\vartheta)$, without and with multiplicity cut, and with
extragalactic magnetic field ($N = 300$ events). The density range of X-ray
selected AGNs with X-ray luminosity $L > 10^{43}$ erg/s is also shown.

Finally, we want to estimate how well the PAO can establish that the clustering
is not by chance, or equivalently, that the number of sources is finite. From
an experimental point of view, the PAO will measure a certain value $w_1^*$ of
the auto-correlation function. From this measurement, one can estimate the
density $n_s$ of sources. For $N = 300$ and $n_s = 2 \times 10^{-5}/{\rm Mpc}^3$
the mean of $p(w_1|\vartheta)$ is $\langle w_1 \rangle = 138$. On the other
hand, the largest value found in $10^6$ simulations for a continuous
distribution is $w_1 = 67$. Thus for any estimated source density smaller than
$n_s = 10^{-4}/{\rm Mpc}^3$ the PAO can establish clustering with chance
probability smaller than $10^{-6}$. The smallest value $w_1$ compatible at 99%
C.L. with a true density $n_s = 2 \times 10^{-5}/{\rm Mpc}^3$ is only at 0.1%
compatible with an infinite number of sources.
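
The argument above, that no simulation out of $10^6$ reaches the expected
value, amounts to an empirical bound on the chance probability. A minimal
sketch of that bookkeeping, with a hypothetical function name and toy numbers,
and with $1/n_{\rm sim}$ used as a simple resolution limit when no simulation
reaches the observed value:

```python
def chance_probability(simulated_w1, observed_w1):
    """Return (value, is_upper_limit) for P(w1 >= observed_w1) under the null."""
    n_sim = len(simulated_w1)
    n_exceed = sum(1 for w in simulated_w1 if w >= observed_w1)
    if n_exceed == 0:
        # No simulation reached the observed value: only a bound is available.
        return 1.0 / n_sim, True
    return n_exceed / n_sim, False

# Toy null sample (hypothetical values of w1 for a continuous distribution).
null_sample = [40, 45, 52, 67]
print(chance_probability(null_sample, 138))
print(chance_probability(null_sample, 50))
```

With $10^6$ null simulations whose largest value is 67, an observed value near
138 would return the bound $10^{-6}$ flagged as an upper limit.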



4 Summary 

We have investigated the significance of the small-scale clustering of the
arrival directions of UHECRs assuming a finite number of uniformly distributed
proton sources and small deflection of CRs in extragalactic magnetic fields.
The AGASA data favor as source density $n_s \sim 10^{-5}/{\rm Mpc}^3$, a value
where the probability that at least one observed cluster is from a true point
source is larger than 99.97%. Even for densities as large as
$n_s = 10^{-3}/{\rm Mpc}^3$, less than half of the clusters are on average by
chance.

At present, the sparse AGASA data set cannot firmly exclude that the clustering
is by chance without prior knowledge of the source density. In contrast, the
PAO will confirm clustering from a finite number of point sources within one
year of data taking at the $5\sigma$ level for any source density
$n_s < 10^{-4}\,{\rm Mpc}^{-3}$. The PAO will also measure the density of UHECR
sources within a factor of a few, and check the assumption of uniformly
distributed sources. If the PAO detects no significant clustering, then two
possible explanations are that the extragalactic magnetic fields are, especially
in the voids, larger than expected or that the UHECR primaries are nuclei.